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QUESTIONS FOR THE COMMITTEE 


Do you agree that the full model is a better choice than the other models? Is any 
improvement possible to the model? 


Do you agree with the approach of standard error estimation for overall impact? 


Can we improve the estimation of overall impact by considering bias-correction 
for the back-transformation? 


Is ignoring correlation between outgoing (wave 8) and incoming (wave 1) 
rotation groups acceptable? 


Is using constant sampling error variance across time and across waves a 
reasonable assumption? 
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ABSTRACT 


The ABS is embarking on a transformation program, which includes a change in ABS 
systems (editing, imputation, estimation, etc.), a use of different collection modes and a 
change in sampling frames etc. Once this transformation is completed, it is expected to 
deliver positive changes to the production of official statistics. However, there is also a 
risk of introducing impacts on some ABS time series. The ABS is working on developing 
methods to measure, and where needed to adjust for any impacts on such time series. 


This paper discusses the usage of state space modelling approach to measure statistical 
impacts under two scenarios: 1) having a small parallel collection and then 
incrementing (phasing-in) the new approach to the survey, 2) no parallel collection is 
available and a new approach is introduced gradually. We consider a few models in this 
study, including the difference model for parallel collection, phase-in model with and 
without Kalman filter initialisation using estimated impacts from parallel collection 
initially presented at the MAC May 2017 and finally, the full model that utilises both 
parallel collection and phase-in period information. The full model builds on feedback 
led by Professor Hyndman at the MAC May 2017. The models’ performance is evaluated 
from the results of a simulation study for the Australian Labour Force Survey (LFS). Full 
model with parallel collection performs the best in terms of the power of detecting 
overall impacts when compared to the other models. 
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1. INTRODUCTION 


The ABS is embarking on a transformation program, which includes a change in ABS 
systems (editing, imputation, estimation, etc.), a use of different collection modes and a 
change in sampling frames etc. Once this transformation is completed, it is expected to 
deliver positive changes to the production of official statistics. However, there is also a 
risk of introducing impacts on some ABS time series. The ABS is working on developing 
methods to measure, and where needed to adjust for any impacts on such time series. 


The ABS uses a three stage approach to statistical impact management. The first stage 
involves stakeholder engagement, risk and project management, pre-testing, analysis, 
dissemination planning and other tools to manage statistical impacts from process 
changes. The second stage requires data collection from a trial or experiment and 
statistical modelling in order to estimate the magnitude of the impacts and to test their 
significance (impact measurement). Once an impact has been detected, there is a need 
for impact adjustments and efficient dissemination strategy (stage 3). This paper focuses 
on the second stage, which is called statistical impact measurement (SIM). 


The choice of method for statistical impact measurement depends on a few factors such 
as the expected changes in the survey, availability of resources, the clients’ requirement 
for the minimum detectable impact and in case of multiple changes, whether it is 
possible to measure the separated effect for each survey change or the accumulative 
effect of all changes. If the expected changes (such as changes of processing, 
imputation or estimation) do not affect raw unit-level data then impacts can be detected 
by running the old and new systems in parallel and then comparing estimates from the 


two systems. However, if changes are expected at the raw unit-level data (such as, 
change of survey mode) then impact measurement becomes more challenging. The 
most accurate impact estimates in this case can be obtained by using methods of 
embedded experiments that require a parallel collection. Under this approach, the old 
and new surveys are run in parallel for a period of time and this allows simultaneous 
comparison of estimates from the survey under the old and new approach. However, 
this approach is very costly and requires lots of resources and thus, it is often not 
feasible. 


When parallel collection is not possible, there are other methods for SIM that can be 
loosely categorised into two groups: SIM methods with unit-level data and SIM methods 


with aggregated level estimates. The first group includes methods of matching (on 
covariates, propensity scores etc.) and unit-level modelling. The second group consists 
of time series modelling. In this paper, we will consider time series approach to SIM. 
We will make use of multivariate structural time series models in state space form. 
Different versions of this approach are considered in earlier studies by Pfeffermann 
(1991), Van Den Brakel (2009, 2010). Our initial research demonstrates that this 
approach can potentially provide accurate estimates of impacts using data from parallel 
collection and phase-in periods (Zhang et. al, 2018). In this study, we propose a 
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different state space model that has practical benefit as it can be used for both parallel 
collection and phase-in periods. In addition, it can measure impacts on the LFS series 
more precisely. The initial idea of this model was suggested by Professor Rob Hyndman 
following the MAC meeting in May 2017. We have extended his proposed model to take 
into account the sampling error structure of the LFS series and the parallel collection 
design. 


We explore the state space modelling approach to measure impacts on Labour Force 
Survey (LFS) time series as permanent level shifts. Changes to LFS might introduce 
other types of impacts such as temporary effects and changes in seasonal pattern. 
However, we do not consider measuring such impacts in this paper because although 
state space models can potentially handle such impacts, it would take a long time for the 
model to estimate it. It is expected that the SIM model is capable of measuring impacts 
during a short period of time such as during parallel collection and phase-in period. 


We are focusing on the measurement of impacts on the national LFS employment / 
unemployment time series. Impact measurement for lower aggregated time series such 
as employment / unemployment series by states, sex and age groups will be considered 
in the next study. Although the main interest here is to measure impacts on national 
estimates, impact measurement modelling has to be conducted at the wave (time-in- 
survey) level to take into account the fact that impacts are potentially time-in-survey 
sensitive. Note that impacts have to be detected on composite estimator series (the 
estimator used in the LFS). However, it is impossible to create time series at the wave 
level from the composite estimator. This is because composite estimator is a weighted 
average of 56 estimates from 8 rotation groups in the current and previous 6 months. 


Therefore, we fit a model to GREG estimator time series at the wave level and then 
combine detected impacts into an overall impact and adjust composite estimator time 
series by the size of detected impacts. 


Section 2 provides a brief introduction of the characteristics of the current ABS LFS 
survey, possible future changes and basic structural time series model in state space 
form on real LFS data before the transformation. Section 3 describes the models that 
could be used for impact measurement in the LFS during parallel collection if it is 
available and phase-in periods. Section 4 presents an integrated approach for impact 
measurement with full model. Section 5 describes a data simulation process for model 
evaluation and method of estimating an overall impact and its standard error. Section 6 
discusses performance of different models on simulated data. Finally, Section 7 gives a 
brief discussion of findings and concluding remarks. 


All the calculations reported in this paper are carried out with programs written in the 
SSM procedure in SAS and the DLM package in R. 
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2. ABS LABOUR FORCE SURVEY (LFS) 
2.1 LFS design and estimation 


The Australian Labour Force Survey (LFS) is a monthly survey with approximately 26,000 
households selected in the sample. This covers approximately 0.32% of the civilian 
population of Australia aged 15 years and over (ABS, 2018). A multi-stage survey design 
is used to select households in the sample with clustering or stratification of dwellings at 
each selection stage. Households that are selected for the LFS participate in the survey 
for eight consecutive months and then they are replaced by their next-door neighbour 
household. Thus the LFS sample comprises eight sub-samples (or rotation groups, RG 
hereafter) with each sub-sample remaining in the survey for eight months. Each month 
one rotation group "rotates out" and is replaced by a new group "rotating in". This is 
known as rotating panel design. 


Strong overlap between samples in any two successive months (about seven eighths) 
induces a strong serial correlation in the sampling error. In fact, sampling error has a 


degree of positive correlation even between rotated-out and rotated-in groups because 


the replacement sample is selected from the same geographic areas as the outgoing one. 
This induces additional serial correlation in the sampling error even for non-overlapping 
outgoing and incoming RGs. 


Traditionally, the first interview is generally conducted face-to-face and subsequent 
interviews are conducted by telephone. From December 2012 to April 2013, the ABS 
conducted a trial of online electronic data collection. Respondents in a single rotation 
group (i.e. one-eighth of the survey sample) were offered the option of self-completing 
their LFS questionnaire online instead of via a face-to-face or telephone interview. In 
May 2013, the ABS expanded the offer of online electronic collection to 50% of each new 
incoming rotation group. Since September 2013, online electronic collection has been 
offered to 100% of private dwellings in each incoming rotation group. Since April 2014, 
100% of private dwellings across the sample have been offered online electronic 
collection. During the implementation period, the ABS investigated possible impacts of 
mode changes on LFS estimates using different statistical impact measurement 
techniques such as matching on covariates, propensity score matching, longitudinal 
analysis methods as well as state space modelling (later on) and no impacts on the LFS 


estimates were detected. 


The estimation method in the LFS is composite estimation. By exploiting the high 
correlation between overlapping samples across months, the LFS composite estimator 
combines the current month data with previous months’ data by applying different 
factors (called BLUE multipliers) based on the length of time a rotation group has been 
in the survey. After these factors are applied, seven months of data (current and 
previous six months) are weighted to align with known current month population 
benchmarks. 
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There are systematic differences in LFS estimates for the subsequent waves (month-in- 
survey) particularly for unemployment. The level of unemployment declines if the 
rotation group is in the survey for a longer period of time. In the literature, this is 
referred to as rotation group bias (RGB). This fact is well known in other countries, e.g. 
in the U.S. (Bailar, 1975), Canada (Kumar and Lee, 1983, Binder and Hidiroglou, 1988) 
and the Netherlands (Brakel and Krieg, 2009). The RGB has to be taken into account 
when impact measurement is conducted at the rotation group level due to e.g. the 
introduction of time-in-survey sensitive changes to the survey. 


In some calendar months the LFS has supplementary surveys conducted in conjunction 
with the LFS. Some of labour-related supplementary surveys affect original labour force 


estimates. The identified impacts are adjusted by the ABS in seasonally adjusted series 
(but not in original series). There are also known cases when changes in supplementary 
survey program have additional temporary impact on the LFS estimates, e.g. in August 
2014 when changes in supplementary survey program affected the seasonal pattern in 
employment series (see ABS, 2014). If the supplementary survey program changes 
further, then an impact of this change on the LFS series is almost certain. 


2.2 Different types of transformative change 


There is a broad range of potential future changes that are being explored for the LFS, as 
the ABS continues to modernise its statistical methodology. They have been 
conceptualised, for the purpose of this paper, as falling within a three stage model, 
according to the extent to which they represent a change from the current design of the 


survey, and the likely level of risk of a statistical impact. 


e Changes that have been conceptualised as falling within a first stage are either 
regular updates to the survey design (e.g. a new sample design, which for 2018 
includes the use of the Address Register) or have controllable impacts (e.g. 
estimation methodology changes). There is no need for SIM for those changes. 


e Changes in a second stage would be those tied to major changes in the systems 
and infrastructure, and related processes. Impacts due to such changes are rather 
unlikely. For managing statistical risk purposes, parallel processing and pretesting 
with small size sample are recommended. 


e Changes in a third stage would be those focused on the long-term sustainability of 


the LFS. They would involve a greater degree of change that could result in 
considerable changes in respondent behaviour — with likely mode and seasonality 
effects. The two such changes would be optimising the LFS questionnaire around 
e-collection as the primary mode for collection and in the placement and timing of 
supplementary surveys (except for a scenario where supplementary surveys were 
all removed entirely). Design around e-collection as the primary mode implies the 
potential for changes to the questions asked and the number of questions, 


respondent induction and follow-ups and therefore, there might be time-in-survey 
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sensitive effects. This implies the impacts could vary the longer the people stay in 
survey. There may be a uniform increase in the time-in-survey and it is expected 
that the drop off rate of e-collection across months would be much smaller than 
under the current approach. Both changes could have significant impacts on the 
LFS estimates, particularly changes in supplementary surveys program (see 2.1). 
Supplementary survey program changes would also be expected to be time-in- 


survey sensitive. We refer to this as “wave sensitive” in this paper. 


2.3 Modelling LFS time series at the rotation group level 


Let us first consider the structural time series model in state space form for employment 


/ unemployment time series at the wave level before the ABS transformation. Assume 


y,;, is a GREG estimate of a main LFS variable such as the number of employed and the 


number of unemployed persons from the rotation group that is observed for the 7th 
time in survey (¢=J, ..., 8) (refer to as wave 7 hereafter) in the current month ¢. Note 
that y,, are obtained by using the GREG estimator with calibration of each rotation 


group estimates to the total known population benchmarks. We can use the following 


eight-dimensional multivariate structural time series model for modelling series y,, 1. 
Y, =Tig (7 +5, +1,)+b+e, (2.1) 
or in an expanded form: 


Vit b, ery 
: =H (+5, +4.) +] 5 [+] 3 (2.2) 
Yet b, €s + 


Note that all eight series share the same trend, seasonal and irregular component 
because the estimate y,, is obtained from independent subsamples of general 


population (rotation groups) and therefore, they are just different measures of the same 
true value of the Australian employment / unemployment level. Trend, seasonal and 
irregular component (together called “signal”) are the components in the model that 
measure real world changes in the employment / unemployment series. The other two 
(multivariate) components, rotation group bias and sampling error are related to survey 
errors that do not reflect real world changes. It is important to separate the sampling 
error component from the other stochastic components in the model because there is a 
high chance of identifiability problem. To address this issue, we use the unit-level data 
from the LFS to estimate parameters of the sampling error (SaE) component and then 
use them in the state space model (SSM). We describe components of the model 
together with their state equations in a greater detail below. The Kalman Filter (KF) and 


' We denote vectors by bold small letters and matrixes by bold capital letters 
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smoother are used for the estimation of latent variables (states) in the model. The 
model’s parameters are estimated by using Maximum Likelihood Estimation (MLE). 


a) Trend component 7, is modelled as smooth trend (deterministic trend level T, 
and stochastic trend slope R,) with the following state equation: 
T=T_,+R_, (25) 
R=R,.+6, 6,=NID(0,07). (2.4) 
b) Seasonal component S, is modelled as stochastic trigonometric form: as: 


| 3/2 | 
S.= Diet See» (2.5) 
where S,, is the nonstationary stochastic cycle that has the following state 


equation: 
Si, _ eo Pits) sin(2kz/s) Siac 7 Os | 2.6) 
S.,| |—sin(2kz/s) cos(2kz/s) || S..4 | |, 

with k =1,2,...,] 5/2 | and @,, ~ NID(0, oI) with E| a%,.@,, | =0 fork 41. 


c) Irregular component J, is modelled as white noise 1,=¢, ¢, = NID(0, o:) 


The initial condition for trend, seasonal and irregular component is diffuse prior 
(zero initial state value and infinite state variance). 


d) Rotation group bias (RGB) effects b,,b,,b,,b,,b;,b;,b, are permanent level shifts in 
LFS estimates for waves 1,2,3,4,5,6,8 compared to wave 7 (b, =0) respectively. An 
assumption of deterministic RGB was tested on real data. Firstly, we let RGB for 
each wave to be time varying (random walk). However, variance of RGB 
disturbance was almost zero. This supports an assumption of deterministic RGBs. 
The initial condition for RGB states is also diffuse (zero initial state value and 


infinite variance). Note we assume that RGB effects are independent so 
covariances in the initial state variance-covariance matrix are all zeros. 


e) Sampling error (SaE) e,, is modelled as white noise for wave 1 assuming no 


correlation with previous rotated out rotation group, AR(1) for wave 2 assuming 
correlation of rotation group that is second month in survey with its previous 
month estimate and AR(2) for waves 3-8 assuming correlation with previous two 
months estimates: 


€,=u,, uy, = NID(0, O;, ) 
_ ~w 2 

Cy, = PC, +Uy,, Uy, = NID(0,o; ) (2.7) 
= ae 2 

Cs Pei ¢-1 ce Pr&;_21-2 + Ui, Ui, = NID(O, 01, 


* Trigonometric form of seasonal was used in SAS. In R we used a dummy variable form and it (almost) did not 
make any difference in the results 
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We assume there is no correlation between wave one (new rotation groups) and 
previous wave eight (rotating out RGs) although we know that there is some 
correlation between them due to the fact that both old and new RGs are selected 
from the same small geographical area. We found that ignoring this correlation 
does not influence the modelling results much but it makes it simpler to model 
SaE. Note that SaE within each wave is uncorrelated due to the fact that estimates 


within each wave belong to different RGs which are independently selected. There 
is also no correlation across waves at time point ¢ for the same reason. The only 
SaE correlation is between SaE from the current wave and the previous wave (or 
two waves) in the previous time point(s). In state space form, SaE is rewritten as: 


me ies ®, ®, | G4 ft uy (2.8) 
C5 I 0O je, 0 
where e, and u, are 8-dimentional vectors of SaEs and SaE disturbances 


respectively and ®, and @®, are blocks in SaE transition matrix that have AR 


coefficients at lower and two level lower diagonals? 


00 0... 0 0 00... 0 
g 0 0... 0 0 00... 0 

®,=|0 g 0 ... 0|/=LDiag(9,),®,=|9, 0 0 ... 0)/=L’Diag(g,).(2.9) 
00 0... 0 0 00... 0 


AR coefficients and SaE variances are estimated from the LFS unit-level data in the 
following way. Firstly, we estimated autocorrelations (AC) with the method of 
pseudo-errors suggested by Pfeffermann et. al. (1998) and also with GREG 
weighted residuals method (see details in Appendix 1). Both methods gave almost 
identical results. RG autocorrelations of different lags (1-7) are estimated for every 
month across ten years and then they are combined into wave autocorrelation 
series. As they do not vary much with time, we average them across time for each 
combination of wave and lag. From Table 2.1, it can be seen that autocorrelation 
depends on lag and it does not vary much across waves (for the same lag). 
Therefore, we averaged them across waves. The corresponding partial 
autocorrelations (PAC) are only significant at lag 1 and 2. This let us conclude 
AR(2) process is sufficient for modelling wave level SaEs. The PACs and 
corresponding AR coefficients are obtained by solving the Yule-Walker equations. 
Thus, the estimated AR coefficients are @=0.589, og, =0.466 and g, =0.208 for 


unemployment and g=0.835, g, =0.585 and g, =0.3 for employment. 


* Diag(x) denotes a diagonal matrix with values from vector x on the main diagonal; LDiag(x) denotes a lower 
diagonal matrix, i.e. with values x on the entries below the main diagonal, and UDiag(x) is a corresponding 
upper diagonal matrix; L’Diag(x) denotes a two levels lower diagonal matrix and U’Diag(x) isa 
corresponding upper diagonal matrix. 
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2.1 Autocorrelations and partial autocorrelations for wave SEs 


a) unemployment 


lag 1 lag 2 lag 3 lag 4 lag 5 lag 6 lag 7 
wave 2 AC 0.555 
PAC 0.555 
wave 3 AC 0.545 0.452 
PAC 0.545 0.220 
wave 4 AC 0.475 0.443 0.409 
PAC 0.475 0.281 0.174 
wave 5 AC 0.622 0.442 0.487 0.382 
PAC 0.622 0.091 0.296 -0.054 
wave 6 AC 0.694 0.505 0.352 0.450 0.393 
PAC 0.694 0.044 -0.027 0.395 -0.120 
wave 7 AC 0.557 0.535 0.487 0.456 0.399 0.385 
PAC 0.557 0.326 0.169 0.110 0.025 0.051 
wave 8 AC 0.674 0.519 0.503 0.439 0.468 0.336 0.356 
PAC 0.674 0.118 0.214 0.026 0.203 -0.193 0.210 
mean AC 0.589 0.483 0.447 0.432 0.420 0.361 0.356 
PAC 0.589 0.208 0.160 0.131 0.105 0.005 0.064 
b) employment 
lag 1 lag 2 lag 3 lag 4 lag 5 lag 6 lag 7 
wave 2 AC 0.820 
PAC 0.820 
wave 3 AC 0.867 0.798 
PAC 0.867 0.187 
wave 4 AC 0.861 0.829 0.762 
PAC 0.861 0.338 -0.008 
wave 5 AC 0.782 0.761 0.773 0.663 
PAC 0.782 0.383 0.322 -0.133 
wave 6 AC 0.865 0.749 0.722 0.693 0.596 
PAC 0.865 0.003 0.291 0.020 -0.191 
wave 7 AC 0.852 0.779 0.701 0.693 0.675 0.553 
PAC 0.852 0.196 0.001 0.229 0.085 -0.391 
wave 8 AC 0.801 0.815 0.740 0.651 0.612 0.568 0.485 
PAC 0.801 0.483 0.059 -0.201 -0.019 0.119 -0.100 
mean AC 0.835 0.788 0.740 0.675 0.627 0.560 0.485 
PAC 0.835 0.300 0.102 -0.035 0.002 -0.074 -0.105 


SaE variance is also calculated for each RG in each month for the same period of 
time by GREG weighted residuals method, which is applied to the unit-level LFS 
data. SaE variance is relatively consistent across time apart from a small shift 
during the sample redesign in 2013 and the short period in 2008-2009 when the 
LFS experienced a significant reduction in sample size. SaE variance (on relative 
scale) is then combined into waves and averaged across time to form relative 
standard errors (RSEs) (Table 2.2). As it can be seen, the RSEs are very consistent 


2 2 2 2 203 
across waves. Therefore, we assume 0) =O), =..=O0% =o, where o, is an 


average of estimated RSEs on log scale o7 = (In(RSE + 1)) ; 
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2.2 RSEs by wave, % 


1 2 3 4 5 6 7 8 Overall 
unemp 6.3 6.5 6.5 6.6 6.6 6.6 6.7 6.7 6.6 
emp 0.9 0.9 0.9 1.0 0.9 0.9 0.9 0.9 0.9 


: 2. . : : . 
SaE variance o; is used in SaE disturbance covariance matrix. For wave 1, SaE and 


SaE disturbance variances are identical as SaE is just white noise in wave 1. For 
other waves, SaE variance is scaled to take into account SaE autoregressive 


processes (see equations 2.7-2.8). SaE disturbance variance-covariance matrix is: 


0 
Qvar(u,)=|2 °) 2.10) 
0 0 
o, 90 0 0} jo 0 0 0 
0) Oy, 0) 0) 0 no, 0) 0 
Q=|9 0 «oF 0]=|}0 O- x02 0 |=o:Diag(y) (2.11) 
. O . 0 
0 O QO «+: Oy, 0 0 QO «ss V0, 
where loading factors are denoted as 
1, wave | 1, wave | 
Y=", wave2 =4(1-¢"), wave2 (2.12) 


Y,, waves 3-8 P0199? |, Liebeie 8 


For the Kalman Filter initialisation of SaE state, it should be taken into account that 
SaE is a stationary process so the initial condition is exact (non-diffuse) namely 
zero initial state value with the initial state variance-covariance matrix as the 


following: 
e P, P I LDiag (1 
va ‘|-R-| = |-6: a 2.13) 
en Pe Ps gUDiag (1) I 
aS x. 0 
Peal . = tet (2.14) 
0 a 
0 0 O :: O 00 0 ::- O 
go 0 O + 0 1 00 :: O 
Po =| 0 go, 0 ++ O]=go;/0 1 0 ++ 0)=go;LDiag(1) (2.15) 
. O . 0 
0) 0 O -:: O 00 0 ::- O 
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; : 2 2 2 
where lag 1 autocorrelation @ and SaE variance o, =...=o, =o, are known. 


Note that the initial SaE variance-covariance matrix has non-zero variances on the 
main diagonal and covariances on the lower diagonal to take into account the 
dependency of states across one-month lagged waves. 


Model hyperparameters (o; ao. Oo: ) estimated by MLE are presented in Table 2.3. 


2.3 ML estimates of model hyper-parameters (on log scale) 


2 2 2 

O- om 0% 

unemp 4.53E-05 7.00E-07 2.60E-04 
emp 6.21E-08 1.99E-08 1.90E-06 


The residuals diagnostic tests of the above model show that it satisfies the assumption of 
no serial correlation, homoskedasticity and normality with reasonable trend, seasonal 
and irregular estimates. 


3. STATISTICAL IMPACT MEASUREMENT (SIM) MODELS FOR PARALLEL 
COLLECTION AND PHASE-IN 


3.1 SIM model for parallel collection — difference model 


If parallel collection is a possible option for SIM (as it is potentially in the LFS), then 
there will be two sets of series available during the parallel collection period — from the 
old and new approaches. We refer to them hereon as control and treatment series 


respectively. Let us denote the series for control sample as yy and treatment series as 
y, . Control series are modelled as 


Ye =Tgs (7 +S, +7,)+b+ep Gy 


and treatment series have a different SaE component and an additional regression 
component that introduces an impact: 


Y, =Tee(Z+5,+1,)+bt+a+e . (3.2) 


Subtracting control series from treatment, we obtain a difference series y? =y; —y; that 


have only two components — impact @ and the difference between treatment and 


control SaE e? =e! —ef : 
yy =a+e, (3.3) 
or in an expanded form: 


yi, a ery 
: f=] i fa]: (3.4) 


D 
Vs As a 
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Note that because treatment and control series share the same trend, seasonal, irregular 


and RGB, they disappear in the difference series. We refer to the model (3.3) as the 


difference model (Zhang et. al., 2018). We now consider the components in the model 


in more detail. 


a) 


b) 


Impacts a, are modelled as permanent wave-specific level shifts. 


Initial condition for KF is diffuse prior namely zero initial value and infinite 
variance. Note that covariances in the initial condition are zeros, i.e. an 


assumption of independence of impacts across waves is made. 


State equation for difference in SaE can be written as: 


e, _|®, ®, ei || Ue 25 
er | | 1 0 |e, 0 | ae 


where matrix-blocks ®, and ®, are the same as the real data model (eq. 2.9) and 
uy is a vector of difference in SaE disturbances (u? =u; —ur ). 

Disturbance variance-covariance matrix for difference in SaE has variances of 
difference in SaE disturbances on the main diagonal of the upper left matrix-block: 


D 
D Q; 0 

= 3.6 
Q b 4 (3.6) 

o, 0 0 0 

0 o, 0 0 
Q=|0 0 «a, 0 =(1/«" +1-20/ lk” |o;Diag(y) (3.7) 

0 


where «’” is a ratio between sample size of treatment and control group during 
parallel collection. 


Note that the variance of difference in SaE disturbance is calculated as 


C,=6,40 ¢-2p07,0 -<=7,.0, 1x +70, 2p 6, (NK = 
=(1/ +1-2p/ VK" ozy 


where p is the correlation between sampling error of treatment and control 


sample. This correlation is non-zero if treatment and control samples are selected 
from the same small geographical areas. 


KF prior for difference in SaE states has zero initial values and the following initial 
state variance-covariance matrix: 
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(3.8) 


— I pLDiag (1) | 
= 1/ PR 1-2 / | r*) 2 
| weep ie 1 a I 


with variance of difference in SaE on the main diagonal and lagged one covariances 


across waves. As before, autoregressive coefficients @, @, and @,, the ratio of 


sample size of treatment and control sample «’“ and SaE variance o? are known 
(see 2.3). However, correlation between SaE of treatment and control sample p is 


unknown and it needs to be estimated. 


Therefore, the only parameter that needs to be estimated by MLE in this model is the 
correlation between sampling error of treatment and control group p. Other (hyper-) 
parameters are known. Note that this is different to Zhang et. al. (2018) where 


difference in SaE disturbance variance oa rather than p is estimated by MLE in the 
model. 


The difference model can be used for estimation of an impact during parallel collection 
but it is not suitable for cases with no parallel collection or after parallel collection when 


impact estimates can still be improved by using only treatment series. 


3.2 SIM model for phase-in with no parallel collection 


As noted above the difference model is not suitable in cases with no parallel collection 


or after parallel collection because the difference series are defined only in periods 
where both control and treatment series are available. Therefore different models are 
needed for those cases. 

In the case with no parallel collection, the LFS time series consists of the old (control) 
series yy up to the introduction of the new approach and the new (treatment) series y; 


continues after that. The model for such combined series can be formulated as: 
Ye =Tigg (7 +5, +7, ) +b, +e +x, (3.9) 
or in an expanded form: 


L b L 
Vig ! e 
: = Tiss) (Z, +S,4+7,)+} 2 ]+} i f+): f, (3.10) 


L L 
gt b, a AgXs , 


Yi) 4 


C . : 
. |Jj, before wave i phase-in 
. . . ) 
y,, since wave 7 phase-in 
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Xi 


2 before wave i phase-in 


1 since wave 7 phase-in 


Note that dummy variables x,, are different for all waves because it is expected that the 


new approach will be introduced incrementally by one rotation group a month. This 
process is called phase-in (see 5.1 for details). 


Trend, seasonal, irregular component and RGB are modelled as described in 2.3 (real 
data model) and the impact component is the same as the difference model (see 3.1). 
We describe the SaE component in more detail below. 


a) 


SaE state equation is: 


fe eye] ya = 
ery | 1 0 eV, 0| Co 


Disturbance variance-covariance matrix for SaE has variance of SaE disturbance on 
the main diagonal of the upper left matrix-block: 


Q* = . : (3.12) 
oO. lk, 0 0 
) 1,0, Kk, 0 0 
Q=| 0 0 02 IK;, 0 |=o2Diag(y/«K) (3.13) 
0 
0 0) 0 vee 120, Khas 


where k is a vector of dummy variables «,, specified as: 


x if wave i has treatment group (phase-in) at time f 
] otherwise 


and «’’ is the ratio between sample size of treatment and control group during 
phase-in. 

Initial condition for KF is the same as real data model, namely zero initial state 
value and initial state variance-covariance matrix (see eq. 2.13): 


er ; I gLDiag (1) 
— P = P = P . 
var} > =P, =: ile (1) , (3.14) 


t-1 


The hyperparameters estimated by MLE are the same as the ones estimated in the model 


for real data (ie. OF, oO, 07). 


ABS e EXPLORATION OF STATE SPACE MODELLING APPROACHES FOR STATISTICAL IMPACT MEASUREMENT IN ABS TIME SERIES: 


THE LABOUR FORCE SURVEY AS A CASE STUDY 14 


ABS METHODOLOGY ADVISORY COMMITTEE « JUNE 2018 


3.3 SIM model for phase-in with parallel collection 


In the case with parallel collection is available, estimates of impacts can still be improved 


after the phase-in period commences even without having the control series. The 
model for impact measurement in this case is exactly the same as the phase-in model 
(see eq. 3.9). However, information about the size of impacts obtained from the parallel 
collection can be used for the KF initialisation of the impact states. 


Unlike the difference and phase-in model, an initial condition of the impacts is non- 


diffuse. The difference model estimates of the impacts and the respective variance- 
covariance matrix are taken as the initial condition for KF. Note that covariance of the 


impacts is not necessarily zero as it is for the difference model. 


4. STATISTICAL IMPACT MEASUREMENT WITH THE FULL MODEL 
4.1 Full model for SIM 


The obvious disadvantage of the difference model is that it is only applicable when there 
is a parallel collection and a different model has to be used after that. However, the full 
model overcomes this disadvantage as it uses both control and parallel collection 
information. It performs significantly better than the difference model in terms of 
impact detection as it will be shown in section 5. The following shows the model 
specification of the full model: 


vy. e 
_ |= T+S,+1,)+b+} .' 4.1 
i [16x16] ( ) F + . ( ) 
or in an expanded form: 
vi, b, 0 er, 
C C 
¥g t b, 0 €, t 
aa | T.+S8,+1,)+ + +) 2 I, 42 
i, [16x16] ( t t t ) b, at, en ( ) 
1 b, Xs e, 


Note that treatment series have missing values before parallel collection (or before 
phase-in in the case of no parallel collection) and control series have missing values 
since phase-in commences. 


The trend, seasonal, irregular and RGB component are modelled the same as 2.3 and the 
impact component is modelled in the same way as the difference model (see 3.1). We 
describe the SaE component, which is different from the models described above, of the 
full model as follows: 


a) SaE state equation is: 
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er 9, 0 ®, 0O |e] | ul 
e, = 0 9 0 OD, Je, a u; 
| }1 0 0 Oo feS,| | 0 
e | 10 1 0 0 jet,| | 0 


(4.3) 


The variance of SaE disturbance for the control and treatment group is on the main 


diagonal in the variance-covariance matrix of the disturbance. 


Note that the 


variance of SaE disturbance for treatment group needs to be scaled to take into 
account the fact that the sample size of treatment group (particularly during 
parallel collection period) can be smaller than the sample size of control sample. 


Note that Qf =Q,. 


QF QF 0 0 Diag (7) pDiag(y/ Vc *) 0 0 

QF = Q, > 0 0 =o? pDiag(y/ Vc *) Diag(y/K*) 0 0 
: oe 0 0 0 0 

° _ oe 0 0 0 0 


where K* is a vector of variables «,, specified as: 


1,t 


x if wave i has treatment group (parallel run) at time f 


kK, =k if wave i has treatment group (phase-in) at time t_ , 


1 otherwise 


(4.4) 


KF initial condition for SaE states is the following: zero initial state values with the 


initial state variance-covariance matrix 


I (p/ J) pLDiag(1) 
er —_ a 
r [pit (1?*\I (po/ Vx \LDiag(1) 
Vat}. =k; =o, 

4 pUDiag(1) ——_{pe/ Vk" )UDiag(1) I 
T 
Ci 

(por vlic™ )UDiag(1) (o/ x") UDiag (1) (piv \I 


As before all parameters apart from p for SaE are known. 


The estimated parameters in this model are: Or; o., O; and p. 


4.2 Alternative formulation of the full model 
Alternatively the full model can be formulated as: 
y, = Tiss (7, +; +I,)+b+er 


y, =e, +0 


(Po! Vi )LDiag(1) 


(p/x"" )LDiag(1) 
(pi Ve JI 


(1/«" I 


In this model, the treatment series y, is replaced by the difference series y). 


(4.5) 


(4.6) 


Unlike 


the full model, the trend, seasonal, irregular and RGB component are estimated from 
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only the control series. Similar to the difference model, this model can only be applied 
to estimate impacts during parallel collection period because the difference series are 


unspecified otherwise. 


This form of the full model demonstrates the main difference between the difference 
and the full model. As it is described, the full model uses all information available from 
historical time series and parallel collection period, although there are more parameters 
that need to be estimated than the difference model. 


The trend, seasonal, irregular and RGB component are modelled in the same fashion as 
2.3 and the impact component is modelled in the same way as the difference model (see 


3.1). We describe the SaE component as follows. 


a) SaE state equation: 


e | [®, 0 ®, 0 j/e,| | ul 

e, = 0 DM O @®, en ae u, 4 

Cl} +1 60 0. O feo 0 Sa 
Ci C12 

P| 10 1 0 Ole] | 0 


Disturbance variance-covariance matrix: 


Diag (7) Diag(y(2/V«* -1)) 0 0 

OF <0 Diag(y(0/V«*-1)] Diag(y(1/«*+1-2p/ vi*)} 0 0 (48) 
0 0 0 0 
0 0 0 0 


where covariances are calculated as below: 


Cc Cc 


Dy i c\_ on C\_ 2 * 2. > 
CaUp ) = cov(ur uy, —ux,)=cov(us,uj,)—var (ur, ) = py,o; / Ki —Vi00 = 
= 2 | * 


KF initial condition is zero initial state values with initial state variance-covariance 


cov(u 


matrix: 
I Diag| olde -I LDiag (1) olDiag| alk - 
ee 
e — —— —— 
» Diag|p/ Vk" -1 Diag(I/x"*+1-2p/ Vk") olDiag{p/ Vk -1 ol.Diag{I/«""+1-20/ic* 
Var| ¢ |=P) =o, _ 
es, pUDiag 1) olDing| ole” -! I Diag| ole - 
D 
Li. — — — 
otDing| pV -1 otDing| 1/1” +1-2p/Vk™) Diag| pe" -1 Diag|I/""+1-2p/ Vk" 


The estimated parameters (MLE) for this model are: O;, o o and p. 


@? 
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5. DATA SIMULATION AND OVERALL IMPACT ASSESSMENT 


5.1 Data simulation 


Each series used in this study is simulated with SAS. A hundred pairs of control and 
treatment series for employment and unemployment are simulated for each rotation 
group under different correlation between treatment and control samples (the 
correlation is unknown in real life). RG series are then reformulated into wave series by 
time in survey. When the ratio between sample size of treatment and control series is 
different for parallel collection and phase-in periods, we simulate two sets of series with 
different parallel collection and phase-in ratios. We then replace the treatment series 
from the first set with the treatment series from the second set when phase-in begins. 
The simulated data are generated from June 2006 to December 2013. It is expected that 
the beginning month for actual SIM will be June 2006, however, the last month in series 
will be the last month before transitioning LFS to new environment. Control series are 
replaced by missing values from October 2012 and treatment series are replaced by 


missing values until parallel collection begins, which is in June 2011. 


We use the parallel collection scheme that will most likely be implemented in LFS 
transformation period if parallel collection is available. However, simulated impacts are 
simulated from June 2011 unlike in real life (the exact time has yet to be decided). 
There are two schemes under consideration — with 16 and 19 months of parallel 
collection. The 19 month scheme is presented in Figure 5.1. 


5.1 Simulation scheme, 19 months of parallel collection 


wave 


Control 
ONO UDRDWNEPANAURAWNE 


Treatment 


dE 2 3 4 5 6 v4 3 9 10 1212 12 13 14 15 16 17 18 19 


The scheme without parallel collection is presented in Figure 5.2. 


5.2 Simulation scheme in case of no parallel collection 


wave 


Control 
ONOUDRWNREANGAUDWNER 


Treatment 
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Each simulated series contains a few components that are obtained separately: 


1. “Real world component” that is obtained from the national (published) 
unemployment / employment series (on log scale) by removing the sampling error with 
the univariate state space model. The same component is used to generate all waves in 
both treatment and control series. 


2. SaE is generated for eight waves as white noise for wave 1, AR(1) process for wave 
2 and AR(2) process for waves 3-8 (see Appendix 2 for more details). Note that 
observations are independent among waves at a particular time point and within waves. 
Therefore, waves follow an AR process only with their lagged values from the previous 
waves. AR parameters and variance of SaE disturbance used in the simulation are 


estimated from real LFS data at the rotation group level (see section 2.3) and they are in 
log scale. Additionally, treatment series SaE disturbance variance is scaled by a factor 
that is an inverse of the ratio between sample size of treatment and control group. 
There are a few scenarios with different ratio for parallel collection and phase-in periods. 


3. Rotation group bias (RGB) is estimated from real LFS data as the ratio of GREG 
estimate for wave 7 to GREG estimate for wave seven and is added to the series for all 


waves apart from wave seven (see Table 5.1). Note that in log scale, RGB is a difference 
rather than a ratio. RGB is estimated across the same ten year period as the other 
predefined components described above. Monthly GREG estimates are obtained by 
calibrating each rotation group to the national level benchmarks and then they are 
averaged across time. Note that wave seven is a reference wave although any wave can 
be the reference wave or average across waves. The choice of a reference wave does not 
influence the accuracy of any time series component and the size of the components 


except the trend level. 


4. A wave specific permanent level shift Gmpact, RGBD) in log scale is added to the 
treatment series (see Table 5.1). This impact is generated so that the overall impact is 
equal to one standard error of the published estimate (2.5% for unemployment and 
0.37% for employment). Note that the overall impact is obtained by averaging the 
product of the exponential of wave impacts and the composite estimator weights of the 
current month CWO across waves. 


5.1 Simulated RGB and impact 
a) unemployment 


wave RGB RGBD (impact) CWO exp(RGBD) CWO0*exp(RGBD) 
1 0.0814 0.1000 0.7941 1.1052 0.8776 
2 0.0499 0.0500 0.9823 1.0513 1.0326 
3 0.0376 0.0350 1.0240 1.0356 1.0605 
4 0.0260 0.0250 1.0342 1.0253 1.0603 
5 0.0059 0.0100 1.0375 1.0101 1.0479 
6 0.0156 0.0035 1.0393 1.0035 1.0429 
7 0 0.0000 1.0444 1.0000 1.0444 
8 -0.0125 -0.0100 1.0444 0.9900 1.0340 


total impact 
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b) employment 


wave RGB RGBD (impact) CWO exp(RGBD) CWO0*exp(RGBD) 
1 0.0046 0.0500 0.7941 1.0513 0.8348 
2 0.0000 0.0250 0.9823 1.0253 1.0071 
3 -0.0012 0.0100 1.0240 1.0101 1.0343 
4 -0.0008 0.0050 1.0342 1.0050 1.0393 
5 0.0009 0.0005 1.0375 1.0005 1.0380 
6 0.0002 0.0001 1.0393 1.0001 1.0394 
7 0 0 1.0444 1.0000 1.0444 
8 0.0001 -0.0510 1.0444 0.9503 0.9924 


total impact 0.37% 


The data generation procedure described above is repeated 100 times. This gives us 100 
replicates of eight control and treatment wave series (8*2*100 series) for each scenario. 
All series are generated in log scale. Note that all components used in the simulation are 
obtained or estimated from the LFS data to ensure the simulated series are as “realistic” 
as possible. 


5.2 Overall impact assessment 


The main question to answer with our study is the size of the impact that can be 
detected by the model at the end of parallel collection and at the beginning of phase-in 
period. The size of detectable impact can be measured by minimum detectable impact 
ratio (MDIR), which is defined as the size of the impact that can be detected based on a 
stated accuracy criterion: 


woe (5.1) 
8— 


where 2.8 is a multiplier derived from a set of predefined type I and II errors (5% and 
20% respectively), SE(@) is the standard error (SE) of detected impact obtained from 


the model and A is one SE of unemployment or employment according to ABS LFS 
publications (on average 2.5% for unemployment and 0.37% for employment). 


Alternatively we could fix a size of an impact that we would like to detect (MDIR) and 
assess the power of detection of that size of the impact. The power then can be 
estimated as: 


(5.2) 


MDIR-A MDIR-A 
os ~ 40.975 
SE(a@) SE(a@) 


power =1—® Ce = 


where © is the standard Normal distribution function. 


We can obtain actual empirical impact SEs as well as modelled ones from the simulation 
results. The purpose of comparing modelled SEs with empirical ones (they are available 
only in simulation study) is to examine if the modelled SE is an unbiased estimate of the 
true SE. 
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e 


Empirical standard errors s** of an overall impact are calculated as the standard 


deviation of the overall impact across 100 replicates: 


(555) 


where @, is the overall impact in replicate 7 calculated using the composite estimator 
weight (BLUE multiplier) for current month w, and taking exponential of wave level 


impacts @,: 
a= ee id *100, (5.4) 


and @ is the average of a, over100 replicates. Note that wave level and overall impacts 


are on different scales: wave impacts are in log scale whereas the overall impact is 
measured in %. 


Modelled SE of an overall impact is estimated as the average SE of an estimated overall 
impact across 100 replicates: 


(5) 


(5.6) 


where c is a correction factor that accounts for dependency of impacts in waves (c=1.92 
for unemployment and 2.46 for employment), 


5, is the SE of an estimated impact in wave 7 of replicate 7 obtained from the model. 
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General form of correction factor c 


If we calculate the variance correctly then 
8 a\_ Qe. a a : a 4% \ 
var(> we \= var(e )+ 2 does cov(e ge 
= Dea (e) vara) +2 Th wma fvar(ay)var(ay) 
= DW le var | @, Danes MMe 8 Py-inf Val |; ) VAT | (5.7) 
Ignoring the dependence of wave impact estimates we are getting: 
var > we! ~ > we (e” ) var (e“" (5.8 
isl i ~ Logins 8) 
Then dependence correction factor is defined as: 
a ¢ 7 8 ay Ay 
‘le var(a@,)+2>)_ > wine 'e"O., [var (a, ) var (@,;] 


8 of a,\ a, 
a, (e ‘| var(e ') 


To make it easier, assume var (a; ) = o; across waves and e“’ is approximately 1. Then 


(5.9) 


we get that 


Doi +22 yd 
i=l ig i=l k>i WW Pri 


ca [Avett aint Resist (5.10) 


va" 
ae 


for lagged correlations g,_, defined by the corresponding autoregressive process in 


sampling errors. 


6. COMPARISON OF DIFFERENT SIM MODELS 


6.1 Models for parallel collection 


There are two models that are tested for measuring impacts during parallel collection — 
the full model and difference model. Impact estimates and their SEs (empirical and 


modelled) at the wave level are presented in Appendix 3. There following provides a 
summary of the results: 


e Impact estimates are unbiased (converging to the true value) however, there is a 
small gap between estimated and true (simulated) impacts with a short parallel 
collection period. 


e The modelled SEs are close to empirical SEs. However, both the full and difference 
model slightly overestimated the impact SEs. 


e The greater the correlation p between SaE of treatment and control sample, the 


smaller the gap between the estimated and simulated impacts. In addition, the 
impact estimate SEs are lower with a higher correlation. 
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e The smaller the ratio «’* between sample size of treatment and control group during 
parallel collection period’, the greater the gap between the estimated and simulated 
impacts. In addition, the impact SEs are higher with a lower ratio. 


e SEs increase by waves. This is because there is a different number of observations in 
each wave due to the gradual introduction of the parallel sample (see Figure 5.1). For 
example, in the eighth month since the beginning of the parallel collection (the first 
month when all waves have been introduced to the new approach), treatment sample 
has eight observations for wave 1 and only one observation for wave 8. The 
difference becomes smaller over time due to a smaller relative difference in the 


treatment series length of each wave. 


e The distribution of correlation estimates p (not presented here) across 100 


replicates does not violate the assumption of normality significantly. The standard 


error of correlation estimate depends on the correlation value, i.e. the greater the 
simulated correlation, the more precise the estimate of correlation is. On average, 
the full model estimates correlation slightly better than the difference model. 


e The full model also outperforms the difference model in terms of the efficiency and 
the true value convergence rate of the impact estimates. The improvement in the full 
model performance depends on the length of parallel collection and the correlation 


between control and treatment SaE (p). On average, there is a lesser improvement 
as the length of parallel collection and correlation increases. E.g. for p =O 
(unemployment) there is, on average, a 25% and 17% improvement in SE at month 8 
and 19 respectively. With employment data, the improvement is slightly smaller 
(about 20% for rho=0 with 8 months of parallel collection) but the pattern is still the 
same. 


e Alternative formulation of the full model (results are not presented here) performs 
(almost) the same as the full model. 


Overall impact estimates and their standard errors (empirical and modelled) are 
presented in Figure 6.1-6.2. It can be seen that the conclusions made from the wave 
level results are also consistent with the overall level results. Note that the modelled SEs 


are corrected on wave impact estimates dependence. They are close to empirical SEs. 


Uncorrected SEs (not presented here) are on average twice smaller than corrected ones 
for unemployment and about 2.5 times smaller for employment. 


+ We tested scenarios with Kk’ =0. 3, 0.5, 0.8 and 1 although in Appendix 3, the only result that is presented is 


the scenario with k* =1, 
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6.1 Simulated (dotted line) and estimated (solid lines) overall impact during parallel collection 
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6.2 Empirical (dotted line) and modelled (solid line) overall impact SE during parallel collection 


period 

a) full model, unemployment, p.p. b) difference model, unemployment, p.p. 
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Empirical power to detect an impact (Table 6.1) reflects the power to detect an impact if 
the true value of the SE of impact is used. However, we do not have empirical standard 
errors in practice and therefore, we only have the estimated impact SEs from the model. 


Modelled SEs (Table 6.2) are slightly overestimated so therefore the power to detect an 
impact is slightly smaller than the power calculated with empirical (true) SEs. From the 
full model, the power to detect an impact that is the size of one SE (1 SE = 2.5% for 
unemployment and 0.37% for employment) is between 45% ( =0) and 55% (p =0.3) 


> The beginning month at the plots is the eighth month since beginning of parallel collection period (when all 
eight waves have been phased-in). 
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for unemployment depending on correlation p (it is expected that correlation will be 


between 0% and 30%) and about 50% for employment (expected correlation is about 
50%). The difference model power is significantly smaller, 34-46% for unemployment 
and about 46% for employment. 


6.1 Empirical power (%) of detecting overall impact in 19 month of parallel collection 


unemployment employment 
rho 0 0.3 0 0.3 0.5 
kappa 1 1 1 1 1 
1 SE Difference model 43.1 56.2 31.2 41.4 53.2 
Full model 47.6 59.8 35.6 45.7 56.7 
1.5SE Difference model 76.5 88.8 59.7 74.4 86.4 
Full model 81.3 91,2 66.5 79.3 89.1 
2 SE Difference model 94.7 98.9 83.7 93.7 98.3 
Full model 96.7 99.3 88.9 95.9 98.9 


6.2 Modelled power (%) of detecting overall impact in 19 month of parallel collection 


unemployment employment 
rho 0 0.3 0 0.3 0.5 
kappa 1 1 1 1 1 
1 SE Difference model 34.3 46.1 25.7 34.6 45.7 
Full model 44.6 54.5 30.9 39.2 50.0 
1.5SE Difference model 64.6 79.7 50.0 65.1 79.3 
Full model 78.1 87.5 59.1 71.6 83.6 
2 SE Difference model 87.5 96.1 74.3 87.9 95.9 
Full model 95.4 98.6 83.2 92.1 97.5 


6.2 Models for phase-in 
1) Case of no parallel collection 


In case there is no parallel collection two models can be used for impact measurement — phase- 


in model and full model. Note that we use phase-in scheme presented in Figure 5.2®. The two 
models give (almost) identical results so below (Figure 6.3-6.6) we present only results from the 
full model. 


As expected, impact SEs are much greater in case of no parallel collection compared to case with 
parallel collection at both wave and overall levels. Therefore the power of impact detection is 
much greater for the case with parallel collection (Table 6.3). Such in case of no correlation 
between treatment and control SaEs the power to detect an impact of size of one SE of 
unemployment is about 18% without parallel collection (19 months since phase-in) and 45% 
with 19 months parallel collection (by the full model). If there is positive correlation between 
treatment and control groups, the power becomes greater if parallel collection is used but it 
stays the same without parallel collection’. 


° Note that for phase-in we simulated series with rho=0 and we used k’" instead of ik’ in initial covariance 
matrix for SE in the full model 

’ Note that correlation does not play any role for phase-in as there are no overlapping treatment and control 
samples 
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6.3 Simulated (dotted line) and estimated (solid lines) impact after phase-in®, wave level 


a) unemployment 
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b) employment 


6.4 Empirical (dotted line) and modelled (solid line) SE of impact after phase-in, wave level 


a) unemployment 


b) employment 


2013/10 
2013/11 
2013/12 


0.06 stem_rgbd1 0.008 
stem_rgbd2 0.007 as 
0.05 = stem_rgbd3 : 
moc. 0.006 
A stem_rgbd4 
0.0 ——— stem_rgbd5 0.005 
0060 eEEeEyE————————— smeaeetcs stem_rgbd6 9.004 OT ASaaaaaSSSSSSSSSRETITITITITITIT: 
STS ESS og 6 aiOG Wil 6:6 0s) b00 Gre oe BOR erebie mea bI8'S stem_rgbd7 0 003 
0.02 stem_rgbd8  ~" 
eecccccce stee_rgbd1 0.002 
0.01 Coeecoros stee_rgbd2 0.001 
6:00) 8 stee_rgbd3 0.000 
. ed os eS 1h ke me le ee a ee stee_rgbd4 : 
Notnonrnaoanoxdan ANMTONOR DOD 
OOOO OO OOO Sa a evvrrerees stee_rgbd5 Oooo0o0o0o0ao0 oO 
{~~ ttt Tt TCO TOTO TONE SES a 
THN MMNMNMNMN MMM MM OM “rrrrres stee_rgbd6 TrMNMMMNMMNM M 
aod dtd dtd dt dd dtd dd Oe 2... ieecnebd? add dt dd dd a 
ooo ooo o00 0 06 2 fe oooo0o coo oa 3} 
NNNANNANANANN AW A 2. eee stee_rgbd8 NNWNANANANAN AA 
6.5 Simulated (dotted line) and estimated (solid lines) overall impact after phase-in 
a) unemployment, % b) employment, % 
2.5 0.4 Coeececcecececececcesceseeoecce 
2.0 0.3 
mee a 
ra 0.2 
; rho=0, 0.1 
2 kappa=1 
0.0 PRA== (000 
GANMStTHNHORADOAN GAN mMtTHNHORDDOAGAN 
ocooooooo0o eo dead : coooooooco oe a a 
TS, STAN EG, CS, TG OTN TE ES CS OS ER eeeeee SIN TaN. Sas CSS, TS TE ST OR OO, 
MMMMMMNMNMNMMM M NMrMNMMMNMMNMNMM'M M 
aot cot cot tt itt itt tt itt Ot et eS a aA ot ott tI Tt tt Ott Tt tl hc hl 
ooo Oo ooo oOo oo oO ooo ono ooo oo oO SO 
NNNNNNNN SS SN SN NNNNNNNSNN NS SN ON 


6.6 Empirical (dotted line) and modelled (sol 


id line) SEs of overall impact after phase-in 
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® The beginning month in the plots is the eighth month since the beginning of phase-in (when all eight waves 


have been phasea-in). 
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6.3 Power (%) of an overall impact detection (full model) 19 months since phase-in commencement 


a) empirical b) modelled 

unemployment employment unemployment employment 
rho 0 0 rho 0 0 
kappa 1 1 kappa 1 1 
1 SE 12.4 12.7 1 SE 17.8 13.2 
LSE 22.6 23.2 1.5 SE 34.3 24.4 
2 SE 36.4 37.4 2 SE 54.6 39.3 
2.5 SE 52.2 51.8 2.5 SE 73.7 54.2 
3 SE 67.6 67.5 3 SE 87.5 70.1 


2) Case with parallel collection 


If there is parallel collection, there are two models that can be used for impact measurement 
after parallel collection period — the phase-in model with KF initialisation of impact states 
estimated from the difference model, and the full model. Because the difference model does 
not performs as well as the full model during parallel collection period, then the phase-in model 
with KF initialisation is not a good alternative of the full model for phase-in as well. Further we 
consider the full model performance only. Also we limit results only to the case with zero 
correlation and both kappas equal to one. Note that we use parallel collection and phase-in 
scheme presented on Figure 5.1. 


Estimated impacts and their SEs at the wave level results are presented in the Appendix 4 and 
overall results are presented in Figures 6.5-6.6. The results show that the only improvement in 
impact SEs is during phase-in period and almost no improvement after phase-in. As standard 
errors are hardly improving after phase-in the power of impact detection is not increasing as 
well. 


6.7 Simulated (dotted line) and estimated (solid lines) overall impact after phase-in 


a) unemployment, % b) employment, % 

2.50 0.40 

2.25 0.35 ce A alt bl tae le ial 

ae 0.30 

1.50 0.25 

1.25 0.20 

:; 0=0;kapp 0.15 

ee ho= =1_0.10 

0.50 rho=0.3, Kappa=1__©. rho=0.5, kappa=1 

0.25 0.05 a ea 

0.00 0.00 : 
AN MT NOR WOH OO AAN A NMT NOR WD OATAN 
ooo oo oo0oona a oooooqo 0c 0 oO Ona a 
A MT TR TT TE, TE OTSA. ME Meas. TB, Ms TS, TES TE, TES TTS TS TE TE 
NMAMMMMNMMNMMNMM M MMMMNMNMMNMNMMNMM 'M 
a co ot tt itt itt itt tt itt tt oO cc aot ct i ctti i tt ict tt itt itt iO oO eS 
OOOO 0OOWOlUCUODUCOOWCUCOCUCODUCUCOUCUO OOOO OOOO OOO Oo 
NNNNNNNN NN IN NNNNNNNNS SN SOS 


6.8 Empirical (dotted line) and modelled (solid line) SE of overall impact after phase-in 


a) unemployment, p.p. b) employment, p.p. 
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7. DISCUSSION 


In this paper we have presented an integrated approach using multivariate time series 
models in state-space form that can incorporate a period of parallel collection with a phase- 
in of the new sample design. We have fully specified the initial conditions for the sampling 
error process within the KF and demonstrated that achieving high power to detect the 
desired size of impact will be difficult without a considerable period of parallel collection. 


However, while overall power to detect an impact of 1 SE does not reach 80% for practical 
options, the integrated approach based on the full model delivers important gains in power 
relative to the difference model during parallel collection and consequently the phase-in 
period with impact state initialisation. The gain derives from the model using the historical 
data to remove the common time series effects, rather than just differencing during parallel 
collection. It is likely that further small gains would be possible using a longer series. This 
integrated approach also allows for further gains to be explored when additional auxiliary 
series are available that share correlated components with the series estimated from the 
SUIVEY. 


A second advantage derives from the fully integrated nature of the full model. Parallel 
collection and phase-in are not separate models, and ‘initialisation’ follows naturally 
from the end of the parallel collection into the phase-in period. This flexibility allows 
there to be either a gap between parallel collection and phase-in or a seamless transfer. 
With the difference model, parallel collection must finish to allow the calculation of all 
eight effects for initialisation of the phase-in period. 


The work presented here has focused on estimating impact at the national level. While 
achieving a robust measure of an impact at this level is crucial, should an impact be detected 
it will need to be removed from the published estimates at both national and lower levels. 
Therefore, the next phase of development will focus on implementing the approach on 
time series at a sub-group level, with a benchmarking constraint to the national series. Such 
an approach is implemented by Bureau of Labour Statistics (USA) to produce their sub- 
national labour market estimates. Further work, not presented here, is developing an 
approach utilising constrained optimisation techniques to ensure adjusted series at the 


lowest levels remain consistent with the levels at which we can estimate impact directly. 
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APPENDIXES 


Appendix 1. Sampling error variance and autocorrelation calculation 


There were two methods used for SE autocorrelation calculation in this study: pseudo- 
errors method developed by Pfeffermann et.a/. (1998) and GREG weighted residuals 
method. SE variance was estimated using weighted residuals method. Below we 
describe these two methods in more detail. 


Pseudo-errors method 


1. 


. For each panel calculate average: e, 


Calculate GREG estimates for number of employed people and number of 
unemployed people by rotation groups monthly for period from June 2006 till May 
2016. GREG estimates have to be obtained by calibrating independently each rotation 


group to overall population benchmarks. 


. To each rotation group assign respective time-in-survey (wave, panel) /. 


.For each month within the time period calculate average estimate y, (for 


employment and unemployment) across rotation groups by formula y, = - yl ) / 8 , 
where & is the number of rotation groups in LFS. 


(J) (yj) 


. For each panel calculate pseudo panel-survey errors e,” = y/”—y,, where yy” is a 


GREG estimate (of employment or unemployment) for panel7 at month?. 


()_ 1 Nell 


ra » » Where N is the number of months 


in the time period. 


. For each panel 7 ~ calculate averages at different lags Rk ( kR=U...10) 


N+k ; 
‘=> " where e/",, is the pseudo error at month (+R) corresponding 


t=k+1 e. in -p? »p 
to the panel selected from the same rotation group as the panel enumerated for the 7 


th time at month? (in particular, e”* = el y, 


. For each panel calculate covariances C/ and variances Cj at various lags (including 


k=0 for variance), R=0,1,...10: C! -—>, (0) -e (eit elk). 


t=k+1 


. Calculate sampling error autocorrelations at the national level? at lags R=J....10 : 


Py = ae, Cc poe CG; 


. Calculate sampling error autocorrelations at the panel level at different lags: 


” Note overall (national level) sampling error autocorrelations and variances were used in simulation study to 


formulate “real world component” in simulated series by waves. 
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GREG weighted residuals method 
In calculation the following steps were taken: 


1. For calculation of SE variance by rotation groups the unit-level LFS data files need to 
be split by rotation groups and calculation can be then conducted for each rotation 
group separately following steps described below. 


e Calculate individual weighted residuals for each person on the file 
W; (y i x,B ) ) 


where w, is a GREG weight, y, is a labour force status (employed, unemployed) 
for person 7, x,8 is a GREG prediction for person 7 obtained from calibration to 


overall benchmarks (population totals by age, sex etc.) 


Calculate sum of weighted residuals for each variance group g in each stratum 4 


by labour force status (employed, unemployed): ¢,, = a w,(y,-~%8). 


The variance groups were assigned systematically to the sorted list of Primary 
Sampling Unit blocks. There were 32 variance groups used for estimation of 
overall sampling error variance and 20 variance groups for SE variance estimation 
by rotation groups (i.e. 20 in each rotation group). 

e Calculate sum of weighted residuals for each stratum 4 divided by number of 
variance groups in a stratum by labour force status: e, =) _,w,(y,-%,8)/32. 
Note there will be 20 instead of 32 for variance calculation by rotation group (here 
and in following steps). 


e Calculate weighted residuals variance for each month in calculation: 
A 2 
Vary) DC) -¢,) 
2. For each pair of months at the various lags within the time period calculate: 


e variance of movement between pair of months at various lags (dag k =1,..,10): 
Var(y" a y°) ) 


e covariance between pair of months at various lags: 


Cov(¥', 3°) = 1/2(Var(5')+Var(5?)—Var( 5" -¥)) , 


e correlation between pair of months at lag Rk : Corr(3', 3”) = 
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Appendix 2. Sampling error simulation 


The structure of the errors in the wave level time series is complex because the AR(2) 
structure does not operate within a wave but as a lagged dependency across waves. For 


the first rotation group, we then create the following sequence of sampling errors: 


2 
€11~N(0, 0%) 
= 2 
C27 = 044041 + U22, Uz2~N (0,03) 
Cag = 051059 + Oo9@1, FU U33~N(0,07) 
33 21€22 22€11 33) 33 ,03 
_ 2 
C44 = 021033 + 022€22 + Udy, Us4~N (0, 04) 
_ 2 
Cgg = 021€77 + 022066 + Ugg, Ugg~N (0, 03’) 


Cg, ~N(0, o*) 
U10,2~N(0, os) 
U11,3~N(0, of) 


€10,2 = 641€91 + U102, 


C113 = 021€192 + 622€91 + U113, 


where a” is the wave level sampling variance for the LFS, the o;’s are the parameters that 
define the AR(2) process within a rotation group, and the o7’s are the appropriate white 
noise disturbance variances to ensure the overall variance of e;; is always the estimated 
sampling variance 07. Note that these disturbance variances need to be derived as we 
do not have a standard stable AR(2) process, it re-starts with a new uncorrelated error 
after every eight time-points. However, of is very close to the long-term disturbance 
error variance in a long-sequence AR(2) process. 


For the second rotation group, we then create the following sequence of errors 
independent of the first rotation group: 


2 
€2,~N(0, a”) 
_ 2 
C39 = 044€21 + U32, U32~N(0, 03) 
os. 2 
C43 = 02132 + 022€21 + Uy3, U43~N (0,03) 
= 2 
C54 = 021043 + 022€32 + Usa, Us4~N (0,07) 
= 2 
Cog = 021€g7 + 622€76 + Uog, Ugg~N (0, 03) 


€10,1~N(0, o7) 


C1142 = 041€101 + U112, 


C123 = 0216112 + 622101 + Uy23, 


U11,2~N(0, os) 
U123~N(0, 0%) 


utilising the same parameters as for the first rotation group, but starting with first time in 
survey at time 2. We repeat this independent generation up to the eighth rotation 


group, with the following sequence of errors: 


€3,~N(0, a”) 
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Coz = 011091 + Use, Ug2~N (0,03) 
€10,3 = 62192 + 622€g1 + U93, U193~N(0, of) 
€11,4 = 621€10,3 + 022092 + U11,4, U11,4~N(0, 04) 
€15,8 = 021€14,7 + 0220136 + Ui5,8, Uisg~N(0, 08) 
€161~N(0, o*) 
€17,2 = 0114161 + U17,2; U17,2~N (0,03) 
€18,3 = 021€17,2 + 6220161 + Uig3, U1g3~N (0,03) 


Once we have the errors as sequences in rotation group structure, we can re-structure 
the errors into the wave form 


C21 ©22 
€31 €32 €33 
C41 42 C43 Cag 


C51 €52 @53 C54 C55 
C61 &62 &63 C64 65 66 
C71 72 €73 ©74 Cyn Cre 


C77 
€gi &g2 ©€83 ©84 Cgn ge 


C37 &gg 


C91 €92 93 C94 C95 C96 97 Cog 


so that the first wave is first-time-in-survey for each rotation group, second wave is 
second-time-in-survey, and so on. 


We repeat the error generation process for é;; using the same AR parameters but using 
the sampling variance 6*. The &;; are generated such that their correlation with e;; is @. 
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Appendix 3. Impact estimates and their SEs during parallel collection 
period, wave level*® 


1) Unemployment 
A3.1 Simulated (dotted line) and estimated (solid line) impact 


a) full model, rho=0O, kappa=1 b) difference model, rho=0O, kappa=1 
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Pec sim7 
errr ee sim8& 
c) full model, rho=0.3, kappa=1 d) difference model, rho=0.3, kappa=1 
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A3.2 Empirical (dotted line) and modelled (solid line) impact SE 
a) full model, rho=0O, kappa=1 b) difference model, rho=0, kappa=1 
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'° The beginning month at the plots is the eighth month since beginning of parallel collection period (when all 
eight waves have been phased-in). 
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c) full model, rho=0.3, kappa=1 d) difference model, rho=0.3, kappa=1 
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A3.4 Empirical (dotted line) and modelled (solid line) impact SE 


a) full model, rho=0, kappa=1 
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d) difference model, rho=0.3, 
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e) full model, rho=0.5, kappa=1 
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Appendix 4. Impact estimates and their SEs after phase-in, wave 
level** 


1) Unemployment 


A4.1 Simulated (dotted line) and estimated (solid line) impact 


a) full model, rho=O, kappa=1 b) full model, rho=0.3, kappa=1 
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A4.2 Empirical (dotted line) and modelled (solid line) impact SE 


a) full model, rho=O, kappa=1 b) full model, rho=0.3, kappa=1 
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'! The beginning month at the plots is the eighth month since beginning of phase-in period (when all eight 
waves have been phased-in). 
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c) full model, rho=0.5, kappa=1 
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b) full model, rho=0.3, kappa=1 
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